55 results for Performance Improvement

at Indian Institute of Science - Bangalore - India


Relevance:

100.00%

Abstract:

Although it is well known that extremely long low-density parity-check (LDPC) codes perform exceptionally well in error-correction applications, short-length codes are preferable in practical applications. However, short-length LDPC codes suffer from performance degradation owing to graph-based impairments such as short cycles, trapping sets and stopping sets in the bipartite graph of the LDPC matrix. In particular, performance degradation at moderate to high E_b/N_0 is caused by oscillations in the bit node a posteriori probabilities induced by short cycles and trapping sets in the bipartite graph. In this study, a computationally efficient algorithm is proposed to improve the performance of short-length LDPC codes at moderate to high E_b/N_0. This algorithm makes use of the information generated by the belief propagation (BP) algorithm in previous iterations before a decoding failure occurs. Using this information, a reliability-based estimation is performed on each bit node to supplement the BP algorithm. The proposed algorithm gives an appreciable coding gain compared with BP decoding for LDPC codes with code rates of 1/2 or less. The coding gains are modest to significant for regular LDPC codes optimised for bipartite graph conditioning, whereas they are large for unoptimised codes. Hence, this algorithm is useful for relaxing some stringent constraints on the graphical structure of the LDPC code and for developing hardware-friendly designs.

Relevance:

100.00%

Abstract:

Clock synchronization in a wireless sensor network (WSN) is essential, as it provides a consistent and coherent time frame for all the nodes across the network. Typically, clock synchronization is achieved by message passing using a contention-based scheme for media access, such as carrier sense multiple access (CSMA). The nodes try to synchronize with each other by sending synchronization request messages. If many nodes try to send messages simultaneously, contention-based schemes cannot efficiently avoid collisions. The resulting collisions cause message losses, which in turn affect the convergence of the synchronization algorithms. However, the number of collisions can be reduced with a frame-based approach such as time division multiple access (TDMA) for message passing. In this paper, we propose a design that uses a TDMA-based medium access control (MAC) protocol to improve the performance of clock synchronization protocols. The basic idea is to use TDMA-based transmissions when the degree of synchronization improves among the sensor nodes during the execution of the clock synchronization algorithm. The design significantly reduces the collisions among the synchronization protocol messages. We have simulated the proposed protocol in the Castalia network simulator. The simulation results show that the proposed protocol significantly reduces the time required for synchronization and also improves the accuracy of the synchronization algorithm.
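The advantage of slotted access over contention can be illustrated with a toy model. The sketch below is not the protocol proposed in the paper: it assumes a hypothetical frame of `num_slots` slots in which every node sends one synchronization message, either in a uniformly random slot (a crude stand-in for contention-based access) or in a pre-assigned slot (TDMA). All function names and parameters are illustrative.

```python
import random
from collections import Counter


def collided_messages(slot_choices):
    """Count messages that share a slot with at least one other message."""
    counts = Counter(slot_choices)
    return sum(c for c in counts.values() if c > 1)


def contention_frame(num_nodes, num_slots, rng):
    """Toy contention model: every node picks a slot uniformly at random."""
    return [rng.randrange(num_slots) for _ in range(num_nodes)]


def tdma_frame(num_nodes, num_slots):
    """Toy TDMA model: node i owns slot i, so no two nodes share a slot."""
    if num_nodes > num_slots:
        raise ValueError("TDMA needs at least one slot per node")
    return list(range(num_nodes))


def average_collisions(num_nodes, num_slots, trials=1000, seed=0):
    """Average number of collided messages per frame under random slot choice."""
    rng = random.Random(seed)
    total = sum(
        collided_messages(contention_frame(num_nodes, num_slots, rng))
        for _ in range(trials)
    )
    return total / trials
```

In this model a TDMA frame never produces a collision, whereas random slot choice with as many nodes as slots loses a substantial share of the messages. The model ignores carrier sensing and back-off, so it only indicates the direction of the effect, not its size.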

Relevance:

80.00%

Abstract:

The twin demands of energy efficiency and higher performance on DRAM are highly emphasized in multicore architectures. A variety of schemes have been proposed to address either the latency or the energy consumption of DRAMs. These schemes typically require non-trivial hardware changes and end up improving latency at the cost of energy or vice versa. One specific DRAM performance problem in multicores is that interleaved accesses from different cores can degrade row-buffer locality. In this paper, based on the temporal and spatial locality characteristics of memory accesses, we propose a reorganization of the existing single large row-buffer in a DRAM bank into multiple sub-row buffers (MSRB). This reorganization not only improves row hit rates, and hence the average memory latency, but also brings down the energy consumed by the DRAM. The first major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves weighted speedup by 35.8%, 14.5% and 21.6% in quad-, eight- and sixteen-core workloads, along with a 42%, 28% and 31% reduction in DRAM energy. The proposed MSRB organization enables opportunities for the management of multiple row-buffers at the memory controller level. As the memory controller is aware of the behaviour of individual cores, it allows us to implement coordinated buffer allocation schemes for different cores that take into account program behaviour. We demonstrate two such schemes, namely Fairness Oriented Allocation and Performance Oriented Allocation, which show the flexibility that memory controllers can now exploit in our MSRB organization to improve overall performance and/or fairness. Further, the MSRB organization enables additional opportunities for DRAM intra-bank parallelism and selective early precharging of the LRU row-buffer to further improve memory access latencies. These two optimizations together provide an additional 5.9% performance improvement.
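The effect of interleaved accesses on row-buffer locality can be illustrated with a toy model. The sketch below is not the MSRB design from the paper: it assumes that each access is a hypothetical `(row, column)` pair, that a sub-row buffer holds one equal-sized segment of a row, and that buffers are replaced in LRU order. Buffer counts, segment sizes and function names are illustrative assumptions.

```python
from collections import OrderedDict


def single_buffer_hits(accesses):
    """Row hits with one row-buffer that holds the most recently opened row."""
    open_row, hits = None, 0
    for row, _col in accesses:
        if row == open_row:
            hits += 1
        else:
            open_row = row  # row miss: the new row replaces the old one
    return hits


def sub_row_buffer_hits(accesses, num_buffers, cols_per_row):
    """Row hits with num_buffers sub-row buffers and LRU replacement (assumed)."""
    seg_size = cols_per_row // num_buffers
    buffers = OrderedDict()  # keys are (row, segment), ordered LRU to MRU
    hits = 0
    for row, col in accesses:
        key = (row, col // seg_size)
        if key in buffers:
            hits += 1
            buffers.move_to_end(key)  # mark as most recently used
        else:
            if len(buffers) == num_buffers:
                buffers.popitem(last=False)  # evict the LRU segment
            buffers[key] = True
    return hits


def interleave(row_a, row_b, n):
    """Two cores streaming through columns 0..n-1 of different rows, interleaved."""
    accesses = []
    for col in range(n):
        accesses.append((row_a, col))
        accesses.append((row_b, col))
    return accesses
```

For `interleave(3, 9, 8)` with 32 columns per row, the single row-buffer never hits because the two cores keep evicting each other's row, while four sub-row buffers hit on 14 of the 16 accesses.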

Relevance:

70.00%

Abstract:

Fiber-optic CDMA (FO-CDMA) technology is well suited for high-speed local area networks (LANs). In this paper, we model the wavelength/time multiple-pulses-per-row (W/T MPR) FO-CDMA network channel as a Z channel. We compare the performance of the W/T MPR code with and without a hard-limiter and show that a significant performance improvement can be achieved by using hard-limiters in the receivers. In broadcast channels, multiple-access interference (MAI) is the dominant source of noise. Hence, the performance analysis considers only MAI, and other receiver noise is neglected.
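A Z channel is a binary channel in which one input symbol is always received correctly while the other may be flipped. The sketch below illustrates the model only, not the paper's analysis. It assumes the convention usual for intensity-based optical CDMA, where interference can turn a transmitted 0 into a 1 but never a 1 into a 0. The flip probability `p_flip` and the function names are hypothetical.

```python
import random


def z_channel(bits, p_flip, rng):
    """Pass bits through a Z channel.

    A transmitted 1 is always received as 1; a transmitted 0 is received
    as 1 with probability p_flip (assumed to model interference).
    """
    return [1 if b == 1 or rng.random() < p_flip else 0 for b in bits]


def bit_error_rate(sent, received):
    """Fraction of positions in which the received bit differs from the sent bit."""
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)
```

With equiprobable input bits, the error probability of this channel is `p_flip / 2`, since only transmitted zeros can be received in error.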

Relevance:

70.00%

Abstract:

In this paper, based on the temporal and spatial locality characteristics of memory accesses in multicores, we propose a reorganization of the existing single large row-buffer in a DRAM bank into multiple smaller row-buffers. The proposed configuration helps improve the row hit rates and also brings down the energy required for row activations. The major contribution of this work is proposing such a reorganization without requiring any significant changes to the existing widely accepted DRAM specifications. Our proposed reorganization improves performance by 35.8%, 14.5% and 21.6% in quad-, eight- and sixteen-core workloads, along with a 42%, 28% and 31% reduction in DRAM energy. Additionally, we introduce a Need Based Allocation scheme for buffer management that shows additional performance improvement.

Relevance:

70.00%

Abstract:

In this paper, we report drain-extended MOS device design guidelines for RF power amplifier (RF PA) applications. A complete RF PA circuit in a 28-nm CMOS technology node, with the matching and biasing network, is used as a test vehicle to validate the RF performance improvement obtained by systematic device design. A complete RF PA with 0.16-W/mm power density is reported experimentally. By simultaneous improvement of device and circuit performance, a 45% improvement in the circuit RF power gain, a 25% improvement in the power-added efficiency at 1 GHz, and a 5x improvement in the electrostatic discharge robustness are reported experimentally.

Relevance:

60.00%

Abstract:

In speaker adaptation for speech recognition, performance depends on the availability of adaptation data. In this paper, we compare several existing speaker adaptation methods, viz. maximum likelihood linear regression (MLLR), eigenvoice (EV), eigenspace-based MLLR (EMLLR), segmental eigenvoice (SEV) and hierarchical eigenvoice (HEV) based methods. We also develop a new method by modifying the existing HEV method to achieve further performance improvement when only limited data are available. Across the range of adaptation data availability, the new modified HEV (MHEV) method is shown to perform better than all the existing methods, except for MLLR when more adaptation data are available.

Relevance:

60.00%

Abstract:

Clustered VLIW architectures solve the scalability problem associated with flat VLIW architectures by partitioning the register file and connecting only a subset of the functional units to a register file. However, inter-cluster communication in clustered architectures leads to increased leakage in functional components and a high number of register accesses. In this paper, we propose compiler scheduling algorithms targeting two previously ignored power-hungry components in clustered VLIW architectures, viz. the instruction decoder and the register file. We consider a split decoder design and propose a new energy-aware instruction scheduling algorithm that provides, on average, a 14.5% and 17.3% benefit in decoder power consumption over a purely hardware-based scheme for 2-clustered and 4-clustered VLIW machines, respectively. In the case of register files, we propose two new scheduling algorithms that exploit limited register snooping capability to reduce extra register file accesses. The proposed algorithms reduce register file power consumption on average by 6.85% and 11.90% (10.39% and 17.78%), respectively, along with a performance improvement of 4.81% and 5.34% (9.39% and 11.16%) over a traditional greedy algorithm for a 2-clustered (4-clustered) VLIW machine. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

60.00%

Abstract:

FDDI (Fibre Distributed Data Interface) is a 100 Mbit/s token ring network with two counter-rotating optical rings. In this paper, various possible faults (such as a lost token or link failures) are considered, and the fault detection process, the ring recovery process after a failure, and the reliability mechanisms provided are studied. We suggest a new method to improve the fault detection and ring recovery process. The performance improvement in terms of station queue length and average delay is compared, through simulation, with the performance of the existing fault detection and ring recovery process. We also suggest a modification to the physical configuration of FDDI networks, within the guidelines set by the standard, to make the network more reliable. It is shown that, unlike the existing FDDI network, full connectivity is maintained among the stations even when multiple single link failures occur. A distributed algorithm is proposed for link reconfiguration of the modified FDDI network when many successive as well as simultaneous link failures occur. The performance of the modified FDDI network under link failures is studied through simulation and compared with that of the existing FDDI network.

Relevance:

60.00%

Abstract:

In this paper, we present a new speech enhancement approach that is based on exploiting the intra-frame dependency of discrete cosine transform (DCT) domain coefficients. Existing enhancement techniques treat the transform-domain coefficients independently. Instead of this traditional approach of independently processing the scalars, we split the DCT domain noisy speech vector into sub-vectors, and each sub-vector is enhanced independently. Through this sub-vector based approach, the higher dimensional enhancement advantage, viz. non-linear dependency, is exploited. In the developed method, each clean speech sub-vector is modeled using a Gaussian mixture (GM) density. We show that the proposed Gaussian mixture model (GMM) based DCT domain method, using the sub-vector processing approach, provides better performance than the conventional approach of enhancing the transform domain scalar components independently. Performance improvement over the recently proposed GMM based time domain approach is also shown.

Relevance:

60.00%

Abstract:

The IEEE 802.16/WiMAX standard has fully embraced multi-antenna technology and can thus deliver robust, high transmission rates and higher system capacity. Nevertheless, due to its inherent form-factor constraints and cost concerns, a WiMAX mobile station (MS) should preferably contain fewer radio frequency (RF) chains than antenna elements. This is because RF chains are often substantially more expensive than antenna elements. Thus, antenna selection, wherein a subset of antennas is dynamically selected to connect to the limited RF chains for transceiving, is a highly appealing performance enhancement technique for multi-antenna WiMAX terminals. In this paper, a novel antenna selection protocol tailored for next-generation IEEE 802.16 mobile stations is proposed. As demonstrated by extensive OPNET simulations, the proposed protocol delivers a significant performance improvement over conventional 802.16 terminals that lack the antenna selection capability. Moreover, the new protocol leverages the existing signaling methods defined in 802.16, thereby incurring negligible signaling overhead and requiring only minor modifications of the standard. To the best of our knowledge, this paper represents the first effort to support antenna selection capability in IEEE 802.16 mobile stations.
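The selection step itself can be sketched as choosing, from all antenna elements, the subset to connect to the available RF chains. The sketch below assumes a simple criterion, the largest per-antenna channel power gain, and does not reproduce the signaling protocol proposed in the paper. The function name and its inputs are hypothetical.

```python
def select_antennas(channel_gains, num_rf_chains):
    """Return the indices of the antennas with the largest power gain |h|^2.

    channel_gains: one (possibly complex) channel coefficient per antenna.
    num_rf_chains: number of antennas that can be connected at once.
    """
    if not 0 < num_rf_chains <= len(channel_gains):
        raise ValueError("num_rf_chains must be between 1 and the antenna count")
    power = [abs(h) ** 2 for h in channel_gains]
    # Rank antenna indices from strongest to weakest channel.
    ranked = sorted(range(len(power)), key=lambda i: power[i], reverse=True)
    return sorted(ranked[:num_rf_chains])
```

For example, with four antennas and two RF chains, the two antennas with the strongest estimated channels are connected, and the choice is repeated whenever the channel estimates change.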

Relevance:

60.00%

Abstract:

The prevalent virtualization technologies provide QoS support within the software layers of the virtual machine monitor (VMM) or the operating system of the virtual machine (VM). The QoS features are mostly provided as extensions to the existing software used for accessing the I/O device. As a result, the applications sharing the I/O device experience a loss of performance due to crosstalk effects or loss of usable bandwidth. In this paper, we examine the NIC sharing effects across VMs on a Xen virtualized server and present an alternate paradigm that improves the shared bandwidth and reduces the crosstalk effect on the VMs. We implement the proposed hardware-software changes in a layered queuing network (LQN) model and use simulation techniques to evaluate the architecture. We find that simple changes in the device architecture and associated system software lead to an application throughput improvement of up to 60%. The architecture also enables finer QoS controls at the device level and increases the scalability of device sharing across multiple virtual machines. We find that the performance improvement derived using the LQN model is comparable to that reported by similar but real implementations.

Relevance:

60.00%

Abstract:

Allgather is an important MPI collective communication. Most of the algorithms for allgather have been designed for homogeneous and tightly coupled systems. The existing algorithms for allgather on grid systems do not efficiently utilize the bandwidths available on the slow wide-area links of the grid. In this paper, we present an algorithm for allgather on grids that efficiently utilizes wide-area bandwidths and is also wide-area optimal. Our algorithm is also adaptive to grid load dynamics, since it considers transient network characteristics for dividing the nodes into clusters. Our experiments on a real grid setup consisting of 3 sites show that our algorithm gives an average performance improvement of 52% over existing strategies.
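In an allgather, every process contributes one data block and every process ends up holding all blocks. The sketch below simulates a generic two-level, cluster-aware scheme in plain Python to show why clustering reduces traffic on wide-area links. It is not the algorithm proposed in the paper, and the leader-based structure, the message count and all names are illustrative assumptions.

```python
def hierarchical_allgather(clusters):
    """Simulate a two-level allgather over clusters of processes.

    clusters: list of dicts, each mapping a process name to its data block.
    Returns (result, wide_area_messages), where result maps every process
    name to a dict containing the blocks of all processes.
    """
    # Step 1: inside each cluster, a leader gathers the local blocks.
    aggregated = [dict(cluster) for cluster in clusters]

    # Step 2: leaders exchange their aggregated blocks over wide-area links,
    # assumed here to take one message per ordered pair of clusters.
    wide_area_messages = len(clusters) * (len(clusters) - 1)
    everything = {}
    for blocks in aggregated:
        everything.update(blocks)

    # Step 3: each leader distributes the complete set inside its cluster.
    result = {proc: dict(everything) for cluster in clusters for proc in cluster}
    return result, wide_area_messages
```

With three clusters, this scheme sends six aggregated messages across wide-area links regardless of how many processes each cluster contains, whereas a flat exchange between individual processes would cross those links once per remote process pair.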

Relevance:

60.00%

Abstract:

Performance improvement of a micromachined patch antenna operating at 30 GHz with a capacitively coupled feed arrangement is presented here. Such antennas are useful for monolithic integration with active components. Specifically, micromachining can be employed to achieve a low dielectric constant region under the patch, which causes (i) suppression of surface waves, and hence an increase in radiation efficiency, and (ii) an increase in bandwidth. The performance of such a patch antenna can be significantly improved by selecting a coupled feed arrangement. We have optimized the dimensions and location of the capacitive feeding strip to get the maximum improvement in bandwidth. Since this is a totally planar arrangement and does not involve any stacked structures, this antenna is easy to fabricate using standard microfabrication techniques. The antenna element thus designed has a -10 dB bandwidth of 1600 MHz.